Discovering Unknown Patterns in Free Text
Abstract
Copyright © 2006, Idea Group Inc. Distributing in print or electronic forms without written permission of IGI is prohibited.

INTRODUCTION

A very large percentage of business and academic data is stored in textual format. With the exception of metadata, such as author, date, title and publisher, these data are not overtly structured like the standard, mainly numerical, data in relational databases. Parallel to data mining, which finds new patterns and trends in numerical data, text mining is the process aimed at discovering unknown patterns in free text. Owing to the importance of competitive and scientific knowledge that can be exploited from these texts, “text mining has become an increasingly popular and essential theme in data mining” (Han & Kamber, 2001, p. 428).

Text mining has a relatively short history: “Unlike search engines and data mining that have a longer history and are better understood, text mining is an emerging technical area that is relatively unknown to IT professions” (Chen, 2001, p. vi).
Similar resources
A review of text mining approaches and their function in discovering and extracting a topic
Background and aim: Four text mining methods are examined, with a focus on understanding and identifying their properties and limitations in subject discovery. Methodology: The study is an analytical review of the literature on text mining and topic modeling. Findings: LSA could be used to classify specific and unique topics in documents that address only a single topic. The other three text min...
Discovering Evolutionary Theme Patterns from Text ∗ CS 598
Temporal Text Mining (TTM) is concerned with discovering temporal patterns in text information collected over time. Since most text information bears some time stamps, TTM has many applications in multiple domains, such as summarizing events in news articles and revealing research trends in scientific literature. In this paper, we study a particular TTM task – discovering and summarizing the ev...
Using Deep Learning Towards Biomedical Knowledge Discovery
A vast amount of knowledge exists within biomedical literature, publications, clinical notes and online content. Identifying hidden, interesting or previously unknown biomedical knowledge from free text resources using an automated approach remains an important challenge. Towards this problem, we investigate the use of deep learning methods that have shown significant promise in identifying hid...
Presenting a Model for Extracting Information from Text Documents, Based on Text Mining in the E-Learning Domain
As computer networks become the backbones of science and the economy, enormous quantities of documents become available. Text mining techniques have therefore been used to extract useful information from textual data. Text mining has become an important research area that discovers unknown information, facts or new hypotheses by automatically extracting information from different written documents. T...
Standard Addition Connected to Selective Zone Discovering for Quantification in the Unknown Mixtures
The univariate calibration method is a simple, cheap and easy-to-use procedure in analytical chemistry. A univariate analysis will be successful if a selective signal can be found for the analyte(s). In this work, two simple ways were used to find the selective signals: the spectral ratio plot (SRP) and the loading plot (LP). Both of them were able to discover the selective regions in the recorded data set...